In-silico modelling studies of 5-benzyl-4-thiazolinone derivatives as influenza neuraminidase inhibitors via 2D-QSAR, 3D-QSAR, molecular docking, and ADMET predictions

Influenza virus disease is one of the most infectious diseases responsible for many human deaths, and the high mutability of the virus causes drug resistance effects in recent times. As such, it became necessary to explore more inhibitors that could avert future influenza pandemics. The present research utilized some in-silico modelling concepts such as 2D-QSAR, 3D-QSAR, molecular docking simulation, and ADMET predictions on some 5-benzyl-4-thiazolinone derivatives as influenza neuraminidase (NA) inhibitors. The 2D-QSAR modelling results revealed GFA-MLR (Rtrain2 =0.8414, Q2 = 0.7680) and GFA-ANN (Rtrain 2 =0.8754, Q2 = 0.8753) models with the most relevant descriptors (MATS3i, SpMax5_Bhe, minsOH and VE3_D) for predicting the inhibitory activities of the molecules which has passed the global criteria of accepting QSAR models. The results of the 3D-QSAR modelling results showed that CoMFA_ES (Rtrain 2 =0.9030, Q2 = 0.5390) and CoMSIA_EA (Rtrain2 =0.880, Q2 = 0.547) models are having good predicting ability among other developed models. The molecules were virtually screened via molecular docking simulation with the active site of NA protein receptor (pH1N1) which confirms their resilient potency when compared with zanamivir standard drug. Molecule 11 as the most potent molecule formed more H-bond interactions with the key residues such as TRP178, ARG152, ARG292, ARG371, and TYR406 that triggered the catalytic reactions for NA inhibition. Furthermore, six (6) molecules (9, 10, 11, 17, 22, and 31) with relatively high inhibitory activities and docking scores were identified as the possible leads for in-silico exploration of novel NA inhibitors. The drug-likeness and ADMET predictions of the lead molecules revealed non-violation of Lipinski’s rule and good pharmacokinetic profiles respectively, which are important guidelines for rational drug design. Hence, the outcome of this study overlaid a solid foundation for the in-silico design and exploration of novel NA inhibitors with improved potency.


Introduction
Influenza virus disease remains one of the major health menaces affecting humans because of its high mortality and morbidity rates in recent times even with the devastating Covid-19 pandemic (Akhtar et al., 2021). The World Health Organization (WHO) reported about 2-5 million cases of severe illness caused by the ravaging seasonal influenza virus epidemic which resulted in over 500,000 deaths globally (Korsten et al., 2021). These flu epidemics cause severe respiratory infections in children, adults, the elderly, and people with underlying health conditions (Aleebrahim-Dehkordi et al., 2022). Influenza virus neuraminidase (NA) is an enzyme that catalyzes the obliteration of terminal sialic acid residues (sialidase) which aids in liberating new virions formed from the infected cells and circulating to infect the neighboring cells (Abed et al., 2016). As such, the NA inhibition can defend the host from being infected and prevent its proliferation (Avila et al., 2020). Due to the highly preserved active site structure of neuraminidase (Adams et al., 2019), it has become an attractive molecular target for the exploration and development of novel anti-influenza inhibitors. Nowadays, zanamivir (Relenza ™ ), oseltamivir (Tamiflu ™ ), laninamivir octanoate (Inavir ™ ), and peramivir (Rapivab ™ ) are the four (4) approved neuraminidase inhibitors for influenza treatment (Hayden et al., 2022). Although, there is a lot of concern for the advent of drug resistance effects resulting from the high variability of the influenza virus (Abed et al., 2016). Hence, there is a need to explore more anti-influenza drugs that have more potent efficiency and binding modes with safer side effects than the currently available drugs. The compounds of thiazolidin-4-onealso known as 4-thiazolinones were reported to have an extensive biological activity range such as anti-fungal, anti-inflammatory, anti-cancer, anti-bacterial, and anti-viral amongst others (Xiao et al., 2021). The trial and error approach applied in the development of new drugs has been seen to be very tedious, costly, and time-consuming (Abdullahi et al., 2021). Hence, the validation of reported anti-influenza agents with the aim of improve them remains an area of high research interest. The application of some in-silico modelling concepts such as quantitative structure-activity relationship (2D-QSAR and 3D-QSAR), molecular docking, and ADMET studies can save time of filteringhits-to-leads, and reduce the cost of synthesizing newly potent drugs (Al-Attraqchi and Venugopala, 2020). In this study, the in-silico modelling concepts such as 2D-QSAR, 3D-QSAR, molecular docking, and ADMET predictions were applied to identify probable lead candidates for the future in-silico design of new anti-influenza agents with improved bioactivities. The 2D-QSAR and 3D-QSAR modelling studies were carried out on 48 compounds of 5-benzyl-4-thiazolinone to predict their anti-influenza activity using some computed structural features of the compounds in numerical values (molecular descriptors). The 2D-QSAR model was initially constructed based on genetic function approximation-multi-linear regression (GFA-MLR) and the subsequent artificial neural regression (GFA-ANN) analysis. The 3D-QSAR models were constructed based on the comparative molecular field analysis (CoMFA) and comparative similarity indices analysis (CoMSIA) methods for the activity prediction of the molecules. The 48 molecules were virtually screened with the NA protein target to identify possible lead candidates based on their docking scores and residual interactions via molecular docking studies. Finally, the physicochemical properties of the molecules were generated to study their drug-likeness and pharmacokinetic profiles.

Data set collection and biological activities (pIC 50 )
A data set containing 48 molecules of 5-benzyl-4-thiazolinone derivatives as inhibitors of influenza (H1N1) neuraminidase (NA) were obtained from the previously published work of Xiao et al. (2021). The reported NA inhibitory activities of the molecules evaluated in IC 50 (μM) were further transformed to a logarithmic scale (pIC 50 ¼ Àlog IC 50 Â 10 À6 ) to eliminate data skewness for the QSAR studies (Abdullahi et al., 2020b). Thirty-five (35) molecules were used as a training set while the remaining 13 molecules were used as the test set as presented in Table 1.

Generation of 2D descriptors
The chemical structures of the 48 molecules in the dataset were accurately drawn using ChemDraw software, then exported to Spartan 14 for initial geometry minimization using molecular mechanics force fields (MMFF). The minimized structures were further optimized at the DFT level (B3LYP/631G**) to determine their most stable conformation using Spartan 14 (Abdullahi et al., 2020a). PaDEL Descriptor tool was then used to calculate about 2000 molecular descriptors from the optimized structures. These molecular descriptors are computed based on their steric potentials, electronic properties, potential hydrogen bonds of path length, relative ionization, and hydrophobicity which are functions of bioactivities (Ahamad et al., 2019).

Data pretreatment and division
The calculated molecular descriptors were pretreated to generate a robust model by removing highly inter-correlated and redundant values  (Apablaza et al., 2017). Furthermore, the pretreated descriptors were divided into training (model development set) and test set (model validation set) using Kennard and Stone's algorithm.

Construction of the 2D-QSAR models
The construction of the QSAR model was established by employing the Material Studio software version 8.0. The Genetic Function Approximation (GFA) was used for the feature selection of the pretreated descriptors in the training set for model development (Abdullahi et al., 2020a), and Friedman Lack-of-Fit (LOF) was selected as the functionality while the scaled LOF smoothness parameter was set at the default of 0.5. The population sample was set to 10,000 at 1000 maximum generations and the number of top equations return was set to 1 (Umar et al., 2019). The descriptor matrix of the best-built model was initially subjected to the Y-Randomization test as a measure to attest to the quality of the model before being exported to Molegro Data Modeller (MDM) software for the development of the MLR and ANN models (Poleboyina et al., 2022). The MLR model assumes that the dependent variable (y) is a linear function of the independent variables (x i ), as shown in Eq. (1).
Where y represents the inhibitory activity (pIC 50 ), x i represents the molecular descriptors, c i is the regression coefficient in the linear model, and c 0 is the intercept of the equation (constant). On the other hand, artificial neural networks are simulated by the biological neural networks of the real world. The artificial systems of the neural networks are simulating the function of the human brain which has shown good performance on classification and regression problems (Darnag et al., 2017). These networks are developed by representing each independent variable as a neuron in the input layer, and connections are formed to other neurons in the next layer (hidden layer). Each of the connections is multiplied by the neuron output layer by a weighted score (Bouakkadia et al., 2021). Hence, the output layer (pIC 50 ) is computed by applying a sigmoid function and summing all inputs as shown in Eqs. (2) and (3) respectively.

Model applicability domain (AD)
The model applicability domain is the theoretical chemical space of the compounds defined by the descriptors and the modeled activity in which the acceptable 2D-QSAR model can make reliable predictions (Abdullahi et al., 2020b). Thus, the technique helps in detecting the structural and response outliers in the training and test set respectively. Furthermore, the leverage approach was utilized to assess the chemical space of the best QSAR model, and the plot of standardized residuals against leverage values (h) also known as the Williams plot was used to assess the chemical space (Ibrahim et al., 2020). As such, compounds with leverage scores less than the threshold (h < h*) and standardized residual scores within AE3.0σ (standard deviation unit) are set to have fallen in the model's chemical space or applicability domain. The warning leverage (h*) is calculated using Eq. (4): Where d is the number of descriptors in the model and N is the number of compounds used as the training set.

Molecular minimization and alignment
The optimized structures were minimized with Gasteiger-Huckel atomic charges of Tripos force field based on Powell conjugate gradient algorithm method at convergence criteria of 0.05 kcal/(mol Å) and 1000 maximum iterations to determine their steady conformation using Sybyl-X 2.1.1 program (Abdizadeh et al., 2017). The molecular alignment of a database is one of the most crucial steps for building a reliable and predictive 3D-QSAR model. Distill rigid alignment method was used to align the molecules in the database to the most potent compound in the dataset (molecule no. 11) as the template with the common core or backbone produced as shown in Figure 1.

Development of 3D QSAR models
The CoMFA and CoMSIA methods were used to generate the 3D-QSAR models of the NA inhibitors using SYBYL-X 2.1.1 software (Tripos Inc) to explain the relationship between the inhibitory activity (pIC 50 ) as the dependent variable and the 3D structure of molecules (Vishwakarma and Bhatt, 2021). The descriptor parameters of the built CoMFA model were electrostatic (E) and steric (S) energy values at a point in space surrounding the molecules while the CoMSIA model was built with more additional field descriptors such as steric(S), electrostatic (E), hydrophobic (H), hydrogen bond donor (HBD) field, hydrogen bond acceptor (HBA) for both training and test set (Goudzal et al., 2022).

Internal and external validation of the 2D and 3D QSAR models
For the 2D-QSAR modelling, the GFA-MLR model building protocol of Material Studio was initially used for the feature selection of the best descriptors matrix (independent variable) to predict the dependent variable (pIC 50 ) (Abdullahi et al., 2020a). Subsequently, molegro data modeller (MDM) was used to generate statistical validation metrics for the 2D-QSAR model based on MLR and ANN regression analysis (Vyas and Georrge, 2015). The prediction capability of the 2D-QSAR models generated was assessed using internal and external validation parameters such as Pearson correlation coefficient (R), adjusted R 2 , Spearman's rank correlation coefficient (ρ), cross-validated regression coefficient (Q 2 ), root mean square error (RMSE), predicted coefficient of determination for the test set ðR 2 Pred Þ, regression coefficients for the training and test set (R 2 train and R 2 test Þ, and coefficient of determination of Y-randomization ðCR 2 p Þ whose calculation formulas are shown in Eqs. (5), (6), (7), (8), (9), (10), (11), and (12).
The correlation coefficient squared (R 2 ) is often used to describe relationships between two variables whose range of value is between 0 and 1. N is the number of compounds in the training set as data points and p is the number of descriptors in the built model. d i is the difference between the ranks of corresponding values predicted and actual responses. Y predict is the predicted response activity, Y actual is the actual response activity, Y mean is the mean value of the actual response activity, and the numerator is PRESS, Y ðtestÞ actual is the actual response activity of the test set, Y ðtestÞ predict is the predicted response activity of the test set, Y ðtrainingÞ mean is the actual mean response activity of the training set, c is the number of components, and R r is the average R of random models (Roy et al., 2016;Roy et al., 2015a, b).
On the other hand, the 3D-QSAR models were constructed by correlating the latent components from the set of available CoMFA and CoMSIA descriptors (independent variable) with the inhibitory activities of the molecules through the partial least squares (PLS) regression analysis (Aouidate et al., 2018). The predictive performance of both models was assessed based on some prominent internal and external validation metrics such as cross-validated (Q 2 ), cross-validated standard error of estimate (SEE), regression coefficients for the training and test set (R 2 train & R 2 test Þ, and predicted coefficient of determination for the test set   Tropsha, 2010;Vucicevic et al., 2019). A concise workflow diagram displaying the general overview processes adopted in this study for the construction of the 2D and 3D-QSAR models was shown in Figure 2.

Molecular docking investigation
A molecular docking investigation was carried out to further predict the potential bindings between neuraminidase protein and the 48 molecules of 5-benzyl-4-thiazolinone derivatives using molecular operating environment (MOE) V2015.10 software. The crystal structure of the 2009 influenza pandemic H1N1 (pH1N1) neuraminidase complexed with zanamivir (PDB: 3TI5) was selected as the template protein while the co-crystalized ligand was used as the reference drug (Vavricka et al., 2011).

Energy minimization and the docking protocol
The 5-benzyl-4-thiazolinone molecules were initially imported to the MOE builder interface for energy minimization and conformational searching until an RMS distance of 0.1 Å and RMSD gradient of 0.01 kcal/ mol were computed for each molecule using Amber12: ET force field. The database of the molecules was later saved for the docking protocol (Ahmed et al., 2021). The pH1N1 neuraminidase was also minimized by fixing all hydrogen atoms, lone pairs, and partial charges accordingly. To increase the docking accuracy, the co-crystalized ligand was initially re-docked to analyze the ligand-active pocket interactions, and the RMSD scores were calculated between the docked poses and the co-crystalized ligand. The MOE program was adjusted to the triangle matcher method based on the London dG scoring function for placement selection and induced fit with GBVI/WSA dG scoring function for refinement before the docking starts (Shakour et al., 2021). The database of ligands in MDB file docked (zanamivir and the thiazolinones) were then imported, and the docking simulation was executed for screening based on their binding scores.

Drug likeness and ADMET prediction studies
The preliminary estimation of physicochemical, drug-likeness and pharmacokinetic parameters of potential drug candidates is crucial, especially at the initial stage of the drug discovery process which helps in rolling out unfavourable effects of the candidates . The pharmacokinetic parameters are based on desirable adsorption, distribution, metabolism, excretion, and toxicity (ADMET) of the query drug when administered into the body (Ibrahim et al., 2020). An efficient and accurate ADMETlab 2.0 webserver (https://admetmesh.scbdd.com/) was utilized to predict numerous physicochemical, drug-likeness, pharmacokinetic, and toxicity parameters of molecules in the study (Babalola et al., 2022;Kar et al., 2022). In addition, the drug-likeness of the 5-benzyl-4-thiazolinones (48 molecules) was assessed based on Lipinski, Ghose, Veber, Egan, and Muegge rules using the SwissADME online webserver at http://www.swissadme.ch/index.php.

2D-QSAR models validation results
The 2D-QSAR modelling approach was successfully carried out on the 5-benzyl-4-thiazolinone derivatives as novel inhibitors against the H1N1 neuraminidase target. The built 2D-QSAR model in this study was used to predict anti-influenza response activities with the influence of some robust and statistically significant descriptors (Ibrahim et al., 2020). As mentioned earlier, the GFA model building protocol of Material Studio software was used for the selection of the best model descriptors. Subsequently, a 2D-QSAR model was successfully derived using MDM software based on the MLR and the subsequent ANN regression modelling with the five (5) best descriptors whose model internal and external validation metrics were shown in Tables 2 and 3 respectively.
The proposed 2D QSAR model equation was given below as Eq. (13); Inhibitory activity ðpIC 50 Þ ¼ À 0:665892 * ATSC6c þ 0:584588 * MATS3i À 1:24459 * SpMax5 Bhe þ 0:0236721 * minsOH þ 0:0326316 * VE3 D þ 8:75957 (13) The prediction quality of the GFA-MLR model was assessed using the validation metrics such as correlation coefficient (R 2 ) of 0.8414, adjusted R 2 of 0.8140, and cross-validation correlation coefficient (Q 2 ) of 0.7680 for the training set, R 2 test of 0.6975, and R 2 pred of 0.6034 accordingly. The Y-Randomization test was ascertained by randomly scrambling of the inhibitory activity (y) and the model descriptors of the training set are kept constant which resulted in the construction of random models (Roy et al., 2016). The 50 random models were generated with low R 2 and Q 2 scores which attested that the original model is robust and not constructed by chance (Roy et al., 2015b). The coefficient of determination  for the Y-randomization test (cR 2 p Þ was computed as 0.7581 (!0.5) which confirmed the reliability of the model generated as shown in Table 4. Hence, the validation parameters generated in this study were within the acceptable threshold parameters as reported in the previous QSAR researches (Tropsha, 2010). Using the same five (5) subset descriptors as the input layer and a single hidden layer with 3 neurons (Figure 3), a GFA-ANN (5-3-1) regression model was constructed to examine the best non-linear relationship between the response activities and the model variables. The predicted inhibitory activity values of the molecules constitute a single output layer which is computed using a sigmoid transfer function. The model was built at default parameter set up in the MDM tool (max epochs ¼ 1000, momentum ¼ 0.2, general/output layer learning rate ¼ 0.3). The result indicates an improved statistical parameter of validation such as R 2 (training set) of 0.8754, cross-validation (Q 2 ) of 0.8753, RMSE of 0.0711, R 2 test of 0.7003, and R 2 pred of 0.5899 as shown in Tables 2 and 3 accordingly. The description name of the model descriptors coded as, ATSC6c, MATS3i, SpMax5_Bhe, minsOH, and VE3_D, along with their computed numerical scores were shown in Tables 5 and 6, while the correlation statistical parameters such as correlation coefficient, VIF, and mean effect of the descriptors were shown in Table 7. The Pearson correlation of less than 0.5 signifies that there is no inter-correlation among each descriptor  pair. The variance inflation factor (VIF) was calculated using Eq. (14), where R 2 is the Pearson correlation coefficient.
The VIF was computed for each model descriptor, and their scores fall within the threshold limit (VIF< 10) suggesting void multicollinearity which implies that each descriptor is orthogonal to one another (Thompson et al., 2017).
The relative importance of the model descriptors based on their magnitude and direction were computed via the mean effect (ME) approach using Eq. (15).
Where βi represents the coefficient of the descriptor i and Di represent each descriptor score for a molecule and n represents the number of training set molecules (Wang et al., 2018). It was observed that the SpMax5_Bhe descriptor has the highest positive mean effect score of 0.9381. This suggested that SpMax5_Bhe has the greatest degree of contribution towards the activity predictions. The predictive potency of the 2D-QSAR models was demonstrated through the graphical scatter plots of experimental versus predicted response activity as shown in    Model error predictions (residuals) consist of three (3) important components such as random (variance), systematic (bias), and measurement (noise) error, but models are more affected by systematic errors (Shirvani and Fassihi, 2021). Therefore, a model with high systematic error should be reconstructed again to reduce the high level of bias. This is because bias redirects the data into an artificial course that could lead to the wrong interpretation (Umar et al., 2019). The predictive competency of the 2D-QSAR model to predict the reported experimental response activities without any computational errors was determined by exploring the residual plot as shown in Figure 5. Since all the residual values fall within the definite threshold of AE2.5. Hence, it implies that the model is free of systematic error and can give a good prediction.

Interpretation of the molecular descriptors
ATSC6c descriptor is the centered Broto-Moreau autocorrelation-lag 6/weighted by charges which belongs to the spatial autocorrelation parameter in molecular graph theory. It is a 2D-autocorrelation descriptor relating to atomic properties, such as charges, polarizability, electronegativity, and atomic masses (Gonçalves et al., 2020). MATS3i is a 2D-autocorrelations descriptor (Moran autocorrelation-lag3/weighted by first ionization potential) that corresponds with the molecular topology, structural features, and atomic properties such as carbon-scaled atomic polarizability, intrinsic state, and van der Waals volume (Altaf et al., 2022). It has a negative coefficient in the model, which implies that the first ionization potential decreases as the inhibitory activity score increases. SpMax5_Bhe descriptor is computed by determining the largest absolute eigenvalue of a modified Burden matrix (n ¼ 5) weighted by relative Sanderson electronegativities (Ibrahim et al., 2020). It has the greatest degree of contribution with a positive influence on bioactivity due to its positive mean effect. The descriptor is related to the positive influence of the electronegative features on bioactivity.
The mins-OH is the minimum atom-type E-state of -OH groups in a molecular graph. It is a 2D descriptor that represents the electrotopological state for atom type which provides a more accurate and chemically expressive meaning to the role of functional groups, such as  -OH, in molecules. Hence, it can be easily described as the addition of -OH groups to the molecular graph increases, the polarity of molecules which makes it more hydrophilic and prevent its cellular membrane uptake (Sanyal et al., 2019). VE3_D is the logarithmic coefficient sum of the last eigenvector from the Barysz matrix (n ¼ 3) weighted by atomic number. The Barysz distance matrix is derived from a vertex and edge-weighted molecular graph that can identify the presence of atoms with more electronegativity (Sanyal et al., 2019).

Model applicability domain of the 2D-QSAR model
The applicability domain of the 2D-QSAR model in chemical space where the model can make a reliable prediction based on the five (5) defined model descriptors stated earlier. In this study, the leverage approach was applied to examine the chemical space. The standardized residuals computed by the GFA-ANN model were plotted against the leverage values for all molecules (Williams plot) to identify the response and structural outliers as presented in Figure 6. Interestingly, all molecules in the dataset were observed to be confined within the standardized residual threshold limit of AE2.5 while, only molecules 4, 8, 47, and 48 were seen to exceed the threshold leverage (h*) of 0.514. Thus, these influential compounds cannot be considered as templates for designing more prominent molecules with enhanced activities.

3D-QSAR models validation results
The CoMFA and CoMSIA models were generated for the 48 molecules of 5-benzyl-4-thiazolinone derivatives as anti-influenza inhibitors targeting the H1N1 subtype. The optimized structures were automatically split based on the structural diversity method into model building set (35 as training set) and test set (13 validation set) using Sybyl-X 2.1.1. The validation parameters for all possible CoMFA models were shown in Table 8. Among the possible CoMFA models generated, the CoMFA_ES model with acceptable statistical validation metrics as Q 2 (0.5390),  Table 9. Furthermore, four (4) CoMSIA models were identified as the best models out of the possible models generated because of their acceptable validation parameters. However, the CoMSIA_EA model was selected as the best model with the computed validation metrics (Q 2 ¼ 0.547 with 4 components, the R 2 value of 0.888, and a relatively low SEE value of 0.0673). In addition, the validations metrics of the best CoMFA and CoMSIA models (Table 10) were found within the benchmark scores for an acceptable QSAR model that was proposed by Alexander Golbraikh and Alexander Tropsha (Q 2 > 0.5 and R 2 > 0.6). This implies that the validation metrics of the models generated are statistically reliable which indicates their predictive potential and robustness (Shirvani and Fassihi, 2021).
The scatter plots of experimental against predicted inhibitory activities (pIC 50 ) for the training and test set molecules of both CoMFA and CoMSIA models revealed a good linear correlation, as presented in Figures 7a and 7b respectively. Also, the systematic error predictions in the models were assessed using the standardized residual versus experimental inhibitory activity plots as shown in Figs 8a-b. Hence, the standardized residual plots for both 3D-QSAR models revealed a random scattering above and below the zero baselines which is an indication of the non-existence of systematic (bias) error in the studies.

Contour map analysis of the CoMFA and CoMSIA models
The contour map interpretation of the CoMFA and CoMSIA models generated is one of the most attractive aspects of the 3D-QSAR analysis which involves the visualization of all favoured and disfavoured regions of the molecular fields surrounding the molecules as bioactivities functions (ElMchichi et al., 2020). Molecule 11 with the highest activity was chosen as a template to examine the most prominent field contributions for the studied dataset. The CoMFA steric and electrostatic contour maps of molecule 11 were shown in Figs. 9A-B. The green contours signified that the desirable addition of bulky groups in the regions would increase the activity, while the yellow contours depicted that bulky groups are undesirable in the region for increasing activity (Gu et al., 2021). Green contours are predominantly distributed on the ethyl acetate and 2-methoxy substituents of molecule 11, proposing that the introduction of bulkier groups in these regions would have enhanced the activity. Meanwhile, the yellow contours were distributed around the oxygen of the same methoxy substituent of the molecule, indicating that further attachment of bulkier fragments to the group's region would cause a decrease in the activity of the molecule. The electrostatic contour maps for the CoMFA model revealed the red contour regions where negatively charged groups are desirable and the blue contour areas are positively charged favoured regions for increasing activity. The red contour maps were observed near the ends of the substituents such as oxygens of ethyl acetate, methyl group attached to the thiazole moiety, hydroxyl (4-OH) and methoxy (3-OCH 3 ) groups of the benzene moiety of molecule 11, while the central surface of the molecule was enclosed with blue contours. The blue contour fields indicate that the further addition of positively charged substituents could lead to an increase in the activity of the molecule.
Furthermore, the red contours around the molecular ends suggest that more electronegative substituents should be attached to the regions for enhanced activity. However, these predictions were consistent with the electrostatic contour map for the CoMSIA_EA model generated. The contour maps for the CoMSIA_EA model with electrostatic and hydrogen bond acceptor field contributions were shown in Figs. 10A-B. From the HBA contours map, the magenta contours reveal the favorable HBA regions while the red contours show the unfavorable HBA regions. As such, the red contour was observed near the oxygen (-O-) of the ethyl acetate substituent attached to thiazole moiety while the magenta contours were observed near carbonyl oxygen (C¼O) of the ethyl acetate, and the hydroxyl oxygen (4-OH) of benzene moiety of the same molecule.
The general depiction of the contour maps from the CoMFA and CoMSIA models were summarized in Figure 11. The outcome of these analysis reaffirmed the largest influence of the 4-position substituents of thiazole on the activities as reported by Xiao et al. (2021). However, the addition of more substituted benzyl methylene with more electronegative and steric groups to the 5-position of the 4-thiazolinone ring may enhance the inhibitory activity of the molecules.

Molecular docking studies
Molecular docking simulation is an important molecular modelling strategy in the computer-assisted design of new molecules (structurebased drug design) which evaluates the binding interactions between the small molecules (ligands) and the targeted protein (Ibrahim et al., 2020). The 48 molecules of the dataset were docked into the active pocket of pH1N1 neuraminidase (PDB: 3TI5) using MOE software. The co-crystalized zanamivir of the protein was extracted and re-docked into the binding pocket to confirm the reliability of the docking algorithm used by the MOE program as well as to note the amino acid residues surrounding the ligand. The binding affinities of the molecules and Figure 11. Summary of the 3D-QSAR analysis.
zanamivir were summarized in Table 11. The results revealed the docking energies of the best poses ranging from À8.2964 to À6.4392 kcal/mol which confirms their robust potency in comparison with zanamivir, indicating that most of the studied molecules have higher binding capacity than zanamivir. However, six (6) leads (9, 10, 11, 17, 22, and 31) with relatively high activity (pIC 50 ! 4.50) and docking score (!À7.5 kcal/mol) were identified as the possible lead candidates for future exploration of improved anti-influenza agents. Molecule 11 as the most potent molecule with the highest pIC 50 of 4.8841 revealed a good docking score of À7.5022 kcal/mol, and the residual profile the complex formed was visualized in Figure 12.
The active site of the pH1N1 neuraminidase (PDB:3TI5) protein contains 19 conserved residues of amino acids where eight among them (ARG118, ARG152, ARG224, ARG292, ARG371, ASP151, GLU276, and Score: the final docking score, rmsd_refine: the root means square deviation between the pose before and after refinement, E_conf: the energy of the conformer. E_refine: core from the refinement stage, calculated to be the sum of the van der Waals electrostatics and solvation energies, under the Generalized Born solvation model (GB/VI), E_score1: Score from rescoring stages 1, E_place: Score from the placement stage, E_score2: Score from rescoring stages 2.
TYR406) are highly conserved and can directly interact with the substrate. The remaining 11 amino acid residues (GLU119, ARG156, TRP178, SER179, ASN294, ILE222, GLU227, THR225, GLU277, LEU134, and SER246) behave as scaffolds for the stabilization of the active site conformation. The best binding modes of molecule 11 in the binding site (11-complex) revealed four (4) conventional H-bonds, three (3) carbon-H bonds, seven (7) electrostatic, four (4) hydrophobic, and two (2) other interaction types with different active amino acid residues as summarized in Table 12. The analysis of the residual interactions revealed 4 complicated networks of hydrogen bond interactions with  the amino acid residues of A: ARG118, A: ARG371, A: ARG371, and A: GLU227 at different bond distances of 3.0003 Å, 2.7684 Å, 2.3066 Å, and 1.8321 Å respectively. The molecule formed 5 different electrostatic interactions due to strong attractive positive charges from the active residues such as A: ARG118, A: ARG292, and A: ARG371, while π-cation and π-anion interactions were conversely formed with A: ARG371 and A: GLU277 respectively. For the hydrophobic interaction, the 2-EtO group of the molecule interacts with the alkyl group of ILE149 to form one (1) alkyl-alkyl hydrophobic interaction type at a distance of 4.3738 Å, while the remaining 2-hydrophobic interactions formed were as a result of the π-alkyl interaction type with ARG225 and PRO431 residues at 5.4696 Å and 4.4297 Å interaction distances respectively. The residual interactions between the thiazole moiety of molecule 11 and highly conserved amino acids in this study are weaker than that of zanamivir as observed in other related studies (Lu and Chong, 2012;Meng et al., 2016;Selvaraj et al., 2020;Wang et al., 2017;Xiao et al., 2021). Also, this is in agreement with our 3D-QSAR studies suggesting that the thiazole moiety may not have a significant effect on NA Figure 13. Physicochemical radar chart of the selected lead compounds in the dataset.
inhibitory activity. The conventional hydrogen bond interactions with ARG118, GLU227, and ARG371 significantly made 5-benzylidenethiazolinone moiety more stable in both SA-cavity and 430-loop cavity accordingly. This tenders an important route for the exploration of more highly potent NA inhibitors targeting both 150 and 430-loop cavities respectively.

Drug likeness assessment and ADMET prediction
In assessing the drug-likeness of the molecules, the physicochemical parameters of the molecules are usually related to some filter variants. Therefore, relevant physicochemical parameters ( Figure 13) generated from the ADMETlab 2.0 web server were assessed using numerous drug-likeness rules such as Lipinski's rule, Pfizer's rule, GSK's rule, and Golden triangle rule as presented in Table 13. The physicochemical properties for the six lead compounds (9, 10, 11, 17, 22, 31) are within the upper limit (brown) and lower limit (red) as presented in the radar charts accordingly ( Figure 13). The six lead molecules which have passed the Lipinski rule of five (Table 14) were further assessed with other drug-likeness filter rules such as the Ghose filter rule, Veber's rule, Egan's rule, and Muegge's rule using the SwissADME web server as shown in Table 15. Lipinski's criteria for drug-likeness include molecular weight (MW 500 g/mol), n-octanol/water distribution coefficient (Log P 5), number of hydrogen bond acceptors (nHA 10), and number of hydrogen bond donors (nHD 5) (Ahmed et al., 2022;Chauhan et al., 2014). From Lipinski's table of the lead molecules in the data set, the Log P scores of the molecules are relatively close to 3 log mol/L (optimal limit) except for molecule 17 which is slightly above the optimal limit (0 < Log P < 3). This implies that the molecules have low aqueous solubility and good oral bioavailability (Adianingsih et al., 2022;Ar amburo-G alvez et al., 2022). The Log P also gives information on the cellular membrane permeability and hydrophobic binding to macromolecules such as the target receptors, plasma proteins, metabolizing enzymes, or transporters (Dowdy et al., 2022). Zanamivir and oseltamivir standard neuraminidase drugs have lower Log P scores of 1.013 and -1.317 which tend to experience difficulty in penetrating the lipid bilayer of the cell membrane.
The lead molecules were also appraised for their quantitative estimate of drug-likeness (QED), synthetic accessibility scores, and other drug-likeness rules such as Ghose, Veber, Egan, and Muegge rules. The QED scores of the lead molecules are less than 0.67 signifying unattractive measures as drug-like compounds (Bickerton et al., 2012), but they are predicted to be easily synthesized due to their low SA scores of less than 6. The lead molecules were able to satisfy the Ghose rule only among the other drug-likeness rules as shown in Table 15.
The biochemical processes involved from the administration of a drug into the body to its elimination play an important role in lead identification and optimization (Hossen et al., 2022). A perfect drug candidate must be non-toxic, and when administered should be absorbed into the circulatory system and eradicated without affecting the biological activity (Hossen et al., 2022). These discrete biochemical processes are closely interrelated leading to the evaluation of ADMET properties as one of the prime factors in the process of drug discovery (Xu, 2022). The six lead molecules were further subjected to the ADMET prediction using the ADMETlab 2.0 web server as mentioned earlier. Some of the relevant computed ADMET parameters include human intestinal absorption (HIA), human colon adenocarcinoma cell lines (Caco-2) permeability, MadinÀDarby Canine Kidney cells (MDCK) permeability, plasma glycoprotein (Pgp) inhibitor, plasma glycoprotein (Pgp) substrate, plasma protein binding (PPB), volume distribution (VD), blood-brain barrier (BBB) penetration, human cytochromes (CYP), clearance (CL), half-life (T1/2), AMES toxicity, carcinogenicity (Carc), eye irritation (EI), respiratory toxicity (RT) as shown in Table 16. The Caco-2 cell permeability has been an important index for an eligible drug candidate which is associated with human intestinal absorption (Ahmed et al., 2022). The lead compounds were considered to have proper Caco-2 cell permeability because their values are higher than the optimal score of 5.15 log cm/s. The computed values for HIA showed that molecules 9 and 10 have excellent probability of absorption in the intestinal membrane. Molecules 11 and 31 were predicted to have moderate absorption, while molecules 17 and 22 tend to be poorly absorbed (>0.7). The MDCK permeability is utilized as an in-vitro model for permeability screening, and its apparent coefficient is used to assess the efficiency of chemicals in the body, and also to estimate the effect of the blood-brain barrier. The lead compounds were considered to have high passive MDCK permeability with predicted coefficients of greater than 2.0 Â 10 À5 cm/s. The output results of the lead molecules revealed an excellent probability of being Pgp non-inhibitor and Pgp non-substrate whose range of values are within 0 and 1. PPB is one of the most important mechanisms of drug uptake and distribution resulting from the drug-protein bindings in the plasma which strongly affects the pharmacodynamics behavior of the drug (Xiong et al., 2021). The lead compounds were also predicted to have a high value of PPB (>90%) depicting a broad therapeutic index. The theoretical concept of the VD parameter is used to relate the administered drug dose with the actual initial concentration in the circulatory system which often describes the in-vivo distribution (Xiong et al., 2021). As such, the lead molecules are predicted to have proper VD values in the range of 0.04-20 L/kg. The BBB permeate output of the lead molecules was classified as BBB þ category whose predicted values are > -1. For the metabolism, the predicted outputs revealed the probabilities of being either lead substrates or inhibitors of CYPs of the isoenzymes (1A2, 3A4, 2C9, 2C19, and 2D6) whose range of values is within 0-1. The clearance of a drug (CL) is an important pharmacokinetic measure that describes how the drug is excreted from the body. The predicted Key: Molecular weight (MW), n-octanol/water distribution coefficient (Log P), number of hydrogens bond acceptors (nHA), number of hydrogen bond donors (nHD), Number of Lipinski violations (nLV). Key: Quantitative Estimate of Drug-likeness (QED), Synthetic accessibility (SA) score.
clearance penetration results of the lead molecules showed that only molecules 17 and 31 with the probability values of 6.945 and 6.521 mg/min/kg respectively, tend to have moderate clearance (<5 mg/min/kg), while other molecules (9, 10, 11 and 22) are predicted to have low clearance levels. In terms of toxicity, the AMES mutagenicity, eye irritation, and respiratory toxicity of the lead compounds were all predicted as non-toxic which is in agreement with the previous reports by Xiao et al. (2021).

Conclusion
In conclusion, the study utilized some in-silico modelling concepts on 48 analogs of 4-thiazolinone as influenza neuraminidase inhibitors. The 2D-QSAR models (GFA-MLR and GFA-ANN) models with the featureselected descriptors as MATS3i, SpMax5_Bhe, minsOH, and VE3_D were adopted for predicting NA inhibitory activity (pIC 50 ) of the molecules. Similarly, the accepted 3D-QSAR models (CoMFA_ES and CoM-SIA_EA) revealed various contour maps for understanding the essential features of structure-activity relationships of these molecules. The conventional hydrogen bond interactions with some key residues such as ARG118, GLU227, and ARG371 significantly made the 5-benzylidenethiazolinone moiety of the molecules to be more stable in both SAcavity and 430-loop cavity of the pH1N1 neuraminidase from the docking studies. Six (6) molecules (9, 10, 11, 17, 22, and 31) with relatively high inhibitory activities, docking scores, and ADMET properties were identified as the possible lead molecules that can be utilized as templates for the in-silico design of novel candidates for anti-influenza therapy. Thus, the outcome of this study overlaid a solid foundation for the insilico design and theoretical exploration of novel neuraminidase inhibitors with improved potency.

Declarations
Author contribution statement Mustapha Abdullahi: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.
Adamu Uzairu, Gideon Adamu Shallangwa: Conceived and designed the experiments; Analyzed and interpreted the data.
Paul Andrew Mamza: Analyzed and interpreted the data. Muhammad Tukur Ibrahim: Conceived and designed the experiments; Contributed reagents, materials, analysis tools or data.

Funding statement
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data availability statement
Data included in article/supplementary material/referenced in article.

Declaration of interests statement
The authors declare no conflict of interest.